Overview

Dataset statistics

Number of variables: 11
Number of observations: 950
Missing cells: 372
Missing cells (%): 3.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 81.8 KiB
Average record size in memory: 88.1 B
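
The derived figures in this table can be cross-checked from the raw counts. The sketch below assumes that the missing-cell percentage is missing cells divided by (variables × observations), and that the average record size is total memory divided by the number of observations. The variable names are illustrative and are not part of the report.

```python
# Figures copied from the dataset statistics table.
n_variables = 11
n_observations = 950
missing_cells = 372
total_memory_kib = 81.8  # already rounded to one decimal in the report

total_cells = n_variables * n_observations

# Share of all cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Average bytes per row, derived from the rounded total size.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Because the total size is itself rounded, the record size computed this way only agrees with the reported 88.1 B to within rounding.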

Variable types

Numeric: 7
DateTime: 2
Categorical: 2

Alerts

VisitID is highly correlated with PatientMRN (High correlation)
PatientMRN is highly correlated with VisitID (High correlation)
BloodPressureSystolic is highly correlated with BloodPressureDiastolic (High correlation)
BloodPressureDiastolic is highly correlated with BloodPressureSystolic (High correlation)
VisitID is highly correlated with PatientMRN and 1 other field (High correlation)
VisitStatus is highly correlated with VisitID (High correlation)
BloodPressureSystolic has 124 (13.1%) missing values (Missing)
BloodPressureDiastolic has 124 (13.1%) missing values (Missing)
Pulse has 124 (13.1%) missing values (Missing)
VisitID is uniformly distributed (Uniform)
VisitID has unique values (Unique)
DateScheduled has unique values (Unique)

Reproduction

Analysis started: 2022-05-24 19:16:24.433851
Analysis finished: 2022-05-24 19:16:30.697378
Duration: 6.26 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

VisitID
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 950
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 475.5
Minimum: 1
Maximum: 950
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.5 KiB

Quantile statistics

Minimum: 1
5th percentile: 48.45
Q1: 238.25
Median: 475.5
Q3: 712.75
95th percentile: 902.55
Maximum: 950
Range: 949
Interquartile range (IQR): 474.5

Descriptive statistics

Standard deviation: 274.3856775
Coefficient of variation (CV): 0.5770466403
Kurtosis: -1.2
Mean: 475.5
Median Absolute Deviation (MAD): 237.5
Skewness: 0
Sum: 451725
Variance: 75287.5
Monotonicity: Not monotonic
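
Several of these figures can be reproduced by hand. The sketch below assumes that VisitID holds exactly the integers 1 through 950, which is consistent with the reported minimum, maximum, distinct count and sum but is not stated outright in the report. It also assumes the report uses the sample (n - 1) variance.

```python
import statistics

# Assumption: VisitID is exactly the integers 1..950 (in some shuffled order).
visit_ids = list(range(1, 951))

total = sum(visit_ids)
mean = statistics.mean(visit_ids)
variance = statistics.variance(visit_ids)  # sample variance, divides by n - 1
std = statistics.stdev(visit_ids)          # square root of the sample variance
median = statistics.median(visit_ids)

# Median absolute deviation: the median distance from the median.
mad = statistics.median(abs(v - median) for v in visit_ids)

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

print(total, mean, variance, round(std, 7), mad, round(cv, 10))
```

Under that assumption the sum, mean, variance, standard deviation, MAD and CV all match the table, which is what one would expect from a surrogate key with no gaps.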
Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
202                 1      0.1%
932                 1      0.1%
472                 1      0.1%
863                 1      0.1%
553                 1      0.1%
58                  1      0.1%
337                 1      0.1%
898                 1      0.1%
453                 1      0.1%
677                 1      0.1%
Other values (940)  940    98.9%

Smallest values
Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Largest values
Value  Count  Frequency (%)
950    1      0.1%
949    1      0.1%
948    1      0.1%
947    1      0.1%
946    1      0.1%
945    1      0.1%
944    1      0.1%
943    1      0.1%
942    1      0.1%
941    1      0.1%

PatientMRN
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 273
Distinct (%): 28.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 592.6484211
Minimum: 4
Maximum: 917
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.5 KiB

Quantile statistics

Minimum: 4
5th percentile: 36
Q1: 93.25
Median: 813.5
Q3: 871
95th percentile: 905
Maximum: 917
Range: 913
Interquartile range (IQR): 777.75

Descriptive statistics

Standard deviation: 345.1260343
Coefficient of variation (CV): 0.5823453198
Kurtosis: -1.208921953
Mean: 592.6484211
Median Absolute Deviation (MAD): 92.5
Skewness: -0.7357927647
Sum: 563016
Variance: 119111.9795
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
90                  21     2.2%
861                 6      0.6%
890                 6      0.6%
872                 6      0.6%
883                 6      0.6%
885                 6      0.6%
866                 6      0.6%
886                 6      0.6%
855                 6      0.6%
881                 6      0.6%
Other values (263)  875    92.1%

Smallest values
Value  Count  Frequency (%)
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%
11     1      0.1%
12     1      0.1%
13     1      0.1%

Largest values
Value  Count  Frequency (%)
917    1      0.1%
916    4      0.4%
915    4      0.4%
914    4      0.4%
913    4      0.4%
912    4      0.4%
911    4      0.4%
910    4      0.4%
909    4      0.4%
908    4      0.4%

ProviderID
Real number (ℝ≥0)

Distinct: 40
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.63894737
Minimum: 1
Maximum: 40
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.5 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 9
Median: 18
Q3: 29
95th percentile: 37
Maximum: 40
Range: 39
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 11.39246969
Coefficient of variation (CV): 0.6112185127
Kurtosis: -1.204020244
Mean: 18.63894737
Median Absolute Deviation (MAD): 10
Skewness: 0.1589957996
Sum: 17707
Variance: 129.7883656
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=40)

Most frequent values
Value              Count  Frequency (%)
1                  38     4.0%
7                  31     3.3%
10                 30     3.2%
13                 29     3.1%
3                  28     2.9%
6                  28     2.9%
34                 27     2.8%
9                  27     2.8%
19                 27     2.8%
8                  27     2.8%
Other values (30)  658    69.3%

Smallest values
Value  Count  Frequency (%)
1      38     4.0%
2      26     2.7%
3      28     2.9%
4      27     2.8%
5      27     2.8%
6      28     2.9%
7      31     3.3%
8      27     2.8%
9      27     2.8%
10     30     3.2%

Largest values
Value  Count  Frequency (%)
40     7      0.7%
39     7      0.7%
38     22     2.3%
37     23     2.4%
36     23     2.4%
35     25     2.6%
34     27     2.8%
33     26     2.7%
32     25     2.6%
31     18     1.9%

DateofVisit
Date

Distinct: 124
Distinct (%): 13.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 KiB
Minimum: 2019-01-01 00:00:00
Maximum: 2019-05-04 00:00:00
Histogram with fixed size bins (bins=50)

DateScheduled
Date

UNIQUE

Distinct: 950
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 KiB
Minimum: 2018-12-04 12:16:32.105000
Maximum: 2019-05-01 03:11:43.005000
Histogram with fixed size bins (bins=50)

VisitDepartmentID
Real number (ℝ≥0)

Distinct: 12
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.252631579
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.5 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 5
Median: 7
Q3: 10
95th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.262548672
Coefficient of variation (CV): 0.4498434308
Kurtosis: -0.9823504559
Mean: 7.252631579
Median Absolute Deviation (MAD): 3
Skewness: -0.2646376325
Sum: 6890
Variance: 10.64422384
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)

Most frequent values
Value             Count  Frequency (%)
11                102    10.7%
10                100    10.5%
12                95     10.0%
8                 93     9.8%
7                 93     9.8%
6                 92     9.7%
5                 92     9.7%
9                 79     8.3%
3                 51     5.4%
4                 51     5.4%
Other values (2)  102    10.7%

Smallest values
Value  Count  Frequency (%)
1      51     5.4%
2      51     5.4%
3      51     5.4%
4      51     5.4%
5      92     9.7%
6      92     9.7%
7      93     9.8%
8      93     9.8%
9      79     8.3%
10     100    10.5%

Largest values
Value  Count  Frequency (%)
12     95     10.0%
11     102    10.7%
10     100    10.5%
9      79     8.3%
8      93     9.8%
7      93     9.8%
6      92     9.7%
5      92     9.7%
4      51     5.4%
3      51     5.4%

VisitType
Categorical

Distinct: 4
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 KiB

Follow Up: 336
Telemedicine: 284
Physical: 205
New: 125

Length

Max length: 12
Median length: 9
Mean length: 8.891578947
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Physical
2nd row: Follow Up
3rd row: Telemedicine
4th row: Telemedicine
5th row: Telemedicine

Common Values

Value         Count  Frequency (%)
Follow Up     336    35.4%
Telemedicine  284    29.9%
Physical      205    21.6%
New           125    13.2%

Length

Histogram of lengths of the category

Pie chart

Word frequencies
Value         Count  Frequency (%)
follow        336    26.1%
up            336    26.1%
telemedicine  284    22.1%
physical      205    15.9%
new           125    9.7%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

BloodPressureSystolic
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 71
Distinct (%): 8.6%
Missing: 124
Missing (%): 13.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 155.8159806
Minimum: 120
Maximum: 190
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.5 KiB

Quantile statistics

Minimum: 120
5th percentile: 124
Q1: 137
Median: 156
Q3: 174
95th percentile: 187
Maximum: 190
Range: 70
Interquartile range (IQR): 37

Descriptive statistics

Standard deviation: 20.50874253
Coefficient of variation (CV): 0.1316215605
Kurtosis: -1.260470802
Mean: 155.8159806
Median Absolute Deviation (MAD): 19
Skewness: -0.05825011107
Sum: 128704
Variance: 420.6085201
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value              Count  Frequency (%)
177                19     2.0%
187                18     1.9%
175                18     1.9%
170                17     1.8%
148                17     1.8%
124                16     1.7%
152                16     1.7%
130                16     1.7%
157                16     1.7%
136                16     1.7%
Other values (61)  657    69.2%
(Missing)          124    13.1%

Smallest values
Value  Count  Frequency (%)
120    3      0.3%
121    9      0.9%
122    15     1.6%
123    9      0.9%
124    16     1.7%
125    10     1.1%
126    11     1.2%
127    11     1.2%
128    15     1.6%
129    11     1.2%

Largest values
Value  Count  Frequency (%)
190    7      0.7%
189    11     1.2%
188    12     1.3%
187    18     1.9%
186    13     1.4%
185    9      0.9%
184    10     1.1%
183    11     1.2%
182    9      0.9%
181    13     1.4%

BloodPressureDiastolic
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 826
Distinct (%): 100.0%
Missing: 124
Missing (%): 13.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 106.8412319
Minimum: 51.6956525
Maximum: 157.2966831
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.5 KiB

Quantile statistics

Minimum: 51.6956525
5th percentile: 68.69983028
Q1: 88.06618526
Median: 107.4602307
Q3: 123.7401145
95th percentile: 144.8957723
Maximum: 157.2966831
Range: 105.6010306
Interquartile range (IQR): 35.67392922

Descriptive statistics

Standard deviation: 23.21974425
Coefficient of variation (CV): 0.2173294321
Kurtosis: -0.7525079825
Mean: 106.8412319
Median Absolute Deviation (MAD): 17.59005166
Skewness: -0.04470087476
Sum: 88250.85755
Variance: 539.1565231
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
144.1434269         1      0.1%
98.37173838         1      0.1%
125.6248144         1      0.1%
95.30392934         1      0.1%
103.6647038         1      0.1%
117.5504416         1      0.1%
121.799735          1      0.1%
93.27169121         1      0.1%
131.1437312         1      0.1%
120.2809439         1      0.1%
Other values (816)  816    85.9%
(Missing)           124    13.1%

Smallest values
Value        Count  Frequency (%)
51.6956525   1      0.1%
54.0998782   1      0.1%
54.28524153  1      0.1%
56.5949727   1      0.1%
57.04394368  1      0.1%
57.15178801  1      0.1%
57.50981924  1      0.1%
57.69248473  1      0.1%
58.28206404  1      0.1%
59.49398968  1      0.1%

Largest values
Value        Count  Frequency (%)
157.2966831  1      0.1%
157.2412301  1      0.1%
155.8220625  1      0.1%
155.8047285  1      0.1%
155.6536085  1      0.1%
155.0291278  1      0.1%
154.7454158  1      0.1%
154.0315557  1      0.1%
153.0311199  1      0.1%
152.9429218  1      0.1%

Pulse
Real number (ℝ≥0)

MISSING

Distinct: 826
Distinct (%): 100.0%
Missing: 124
Missing (%): 13.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 74.46475096
Minimum: 50.07206773
Maximum: 99.98646314
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.5 KiB

Quantile statistics

Minimum: 50.07206773
5th percentile: 52.76212885
Q1: 62.28239936
Median: 73.64203401
Q3: 86.41759319
95th percentile: 97.7159603
Maximum: 99.98646314
Range: 49.91439541
Interquartile range (IQR): 24.13519383

Descriptive statistics

Standard deviation: 14.41959553
Coefficient of variation (CV): 0.1936432386
Kurtosis: -1.164295267
Mean: 74.46475096
Median Absolute Deviation (MAD): 12.18459914
Skewness: 0.07639526991
Sum: 61507.88429
Variance: 207.9247353
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value               Count  Frequency (%)
83.04491976         1      0.1%
76.74427745         1      0.1%
70.18020664         1      0.1%
63.9379516          1      0.1%
73.56910293         1      0.1%
52.69732517         1      0.1%
74.30158602         1      0.1%
92.69329281         1      0.1%
91.64326466         1      0.1%
89.28733097         1      0.1%
Other values (816)  816    85.9%
(Missing)           124    13.1%

Smallest values
Value        Count  Frequency (%)
50.07206773  1      0.1%
50.23633065  1      0.1%
50.27711916  1      0.1%
50.32505836  1      0.1%
50.36915894  1      0.1%
50.37352232  1      0.1%
50.42005912  1      0.1%
50.42864374  1      0.1%
50.44720491  1      0.1%
50.55114609  1      0.1%

Largest values
Value        Count  Frequency (%)
99.98646314  1      0.1%
99.97631306  1      0.1%
99.89238431  1      0.1%
99.84594823  1      0.1%
99.79460369  1      0.1%
99.76746617  1      0.1%
99.66528531  1      0.1%
99.54904534  1      0.1%
99.54496949  1      0.1%
99.40268177  1      0.1%

VisitStatus
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.5 KiB

Completed: 736
No Show: 154
Canceled: 60

Length

Max length: 9
Median length: 9
Mean length: 8.612631579
Min length: 7

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No Show
2nd row: No Show
3rd row: No Show
4th row: No Show
5th row: No Show

Common Values

Value      Count  Frequency (%)
Completed  736    77.5%
No Show    154    16.2%
Canceled   60     6.3%

Length

Histogram of lengths of the category

Pie chart

Word frequencies
Value      Count  Frequency (%)
completed  736    66.7%
no         154    13.9%
show       154    13.9%
canceled   60     5.4%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Interactions

[Figures: 49 pairwise interaction plots, one for each ordered pair of the 7 numeric variables]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
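
The calculation described above can be sketched in plain Python. This is an illustration rather than the code the report itself runs: the helper names `average_ranks`, `pearson` and `spearman_rho` are made up for the example, and giving tied values the mean of their positions is an assumed tie-handling rule.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance over the product of standard deviations (the 1/n factors cancel)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho is Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x**3: nonlinear but strictly increasing

print(spearman_rho(x, y))  # essentially 1: the relationship is monotonic
print(pearson(x, y))       # below 1: the relationship is not linear
```

The toy data show why ρ picks up a monotonic but nonlinear relationship that r understates.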

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
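
A minimal sketch of that formula, together with a check of the invariance property, follows. The function name `pearson_r` and the sample values are illustrative and do not come from the profiled dataset.

```python
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/n (or 1/(n-1)) factors in covariance and deviations cancel out.
    return cov / (sx * sy)


x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 3.0, 3.5, 9.0]

r = pearson_r(x, y)

# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([3 * a + 10 for a in x], [0.5 * b - 4 for b in y])

# A negative scale factor flips the sign but not the magnitude.
r_flipped = pearson_r(x, [-b for b in y])

print(r, r_rescaled, r_flipped)
```

Changing units (for example, recording blood pressure in a different scale) would therefore leave r unchanged.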

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
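
The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition. The function name `kendall_tau_a` is illustrative, and statistical libraries often compute a tie-adjusted variant (tau-b) instead, so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
        # direction == 0 is a tie in x or y and counts as neither
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six

tau = kendall_tau_a(x, y)
print(tau)
```

With six pairs, five concordant and one discordant, τ is (5 - 1) / 6.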

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
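
A sketch of a bias-corrected Cramér's V in plain Python follows. It uses the commonly cited form of Bergsma's correction and is an illustration under that assumption, not a copy of the report's implementation. The function name `cramers_v_corrected` and the toy labels are made up for the example.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equal-length sequences of labels."""
    n = len(a)
    rows, cols = Counter(a), Counter(b)
    cells = Counter(zip(a, b))

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for r_label, r_count in rows.items():
        for c_label, c_count in cols.items():
            expected = r_count * c_count / n
            observed = cells.get((r_label, c_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(rows), len(cols)

    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)

    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Perfect association: each label in a maps to exactly one label in b.
v_perfect = cramers_v_corrected(["x", "y"] * 10, ["p", "q"] * 10)

# Independence: every combination occurs equally often.
v_independent = cramers_v_corrected(["x", "x", "y", "y"] * 5, ["p", "q", "p", "q"] * 5)

print(v_perfect, v_independent)
```

In this report the same measure would apply to the two categorical variables, VisitType and VisitStatus.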

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. The Phik project provides extensive documentation.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
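
One way to read nullity correlation is as the Pearson correlation between per-column "is missing" indicators. The sketch below assumes that definition. The function name `nullity_correlation` and the toy columns are hypothetical and only loosely modelled on the vitals columns in this dataset.

```python
from statistics import mean


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns.

    Returns None when either column has no variation in missingness
    (fully present or fully missing), since the correlation is undefined.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    sa = sum((p - ma) ** 2 for p in a) ** 0.5
    sb = sum((q - mb) ** 2 for q in b) ** 0.5
    if sa == 0 or sb == 0:
        return None
    return cov / (sa * sb)


# Hypothetical columns: the two vitals are missing in the same rows.
systolic = [188.0, None, 133.0, None, 131.0]
pulse = [83.0, None, 80.9, None, 89.2]
visit_id = [202, 436, 794, 799, 515]  # never missing

print(nullity_correlation(systolic, pulse))     # close to 1: missing together
print(nullity_correlation(systolic, visit_id))  # None: visit_id is never missing
```

BloodPressureSystolic, BloodPressureDiastolic and Pulse each have 124 missing values here. A nullity correlation near 1 between them would indicate that the gaps fall in the same rows, which the summary counts alone cannot confirm.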
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

Row  VisitID  PatientMRN  ProviderID  DateofVisit  DateScheduled            VisitDepartmentID  VisitType     BloodPressureSystolic  BloodPressureDiastolic  Pulse      VisitStatus
0    202.0    840.0       29.0        2019-03-19   2019-03-13 07:59:24.000  10.0               Physical      188.0                  144.143427              83.044920  No Show
1    436.0    820.0       25.0        2019-03-19   2019-02-24 07:10:31.043  3.0                Follow Up     179.0                  118.876808              90.309544  No Show
2    794.0    879.0       30.0        2019-04-02   2019-03-19 01:41:55.656  11.0               Telemedicine  133.0                  98.749563               80.859776  No Show
3    799.0    884.0       37.0        2019-03-03   2019-02-25 01:25:39.696  4.0                Telemedicine  132.0                  98.001270               82.184737  No Show
4    515.0    32.0        26.0        2019-03-16   2019-02-20 06:44:59.617  11.0               Telemedicine  131.0                  85.115632               89.192362  No Show
5    809.0    894.0       31.0        2019-03-13   2019-02-28 10:28:01.091  2.0                Physical      121.0                  57.692485               60.402573  No Show
6    491.0    875.0       23.0        2019-01-09   2019-01-08 03:49:56.264  6.0                Follow Up     175.0                  132.883861              55.000807  No Show
7    194.0    832.0       11.0        2019-03-11   2019-03-03 15:38:54.164  2.0                Telemedicine  177.0                  146.238936              59.087923  No Show
8    811.0    896.0       23.0        2019-03-15   2019-03-11 05:04:41.511  4.0                Physical      140.0                  74.519839               85.051919  No Show
9    726.0    878.0       16.0        2019-04-04   2019-03-10 15:52:39.962  9.0                Telemedicine  183.0                  122.982498              68.800091  No Show

Last rows

Row  VisitID  PatientMRN  ProviderID  DateofVisit  DateScheduled            VisitDepartmentID  VisitType     BloodPressureSystolic  BloodPressureDiastolic  Pulse      VisitStatus
940  705.0    857.0       34.0        2019-03-14   2019-03-02 21:48:23.166  12.0               Follow Up     163.0                  98.706465               91.345959  Canceled
941  399.0    638.0       11.0        2019-02-10   2019-02-02 20:49:56.868  2.0                Physical      121.0                  64.896818               68.496869  Canceled
942  392.0    617.0       4.0         2019-02-03   2019-02-01 02:50:08.377  7.0                Telemedicine  185.0                  127.419321              84.645360  Canceled
943  403.0    650.0       15.0        2019-02-14   2019-02-11 00:47:42.216  6.0                Physical      129.0                  68.506636               62.732434  Canceled
944  484.0    868.0       16.0        2019-01-02   2018-12-19 12:12:58.300  7.0                Follow Up     178.0                  110.145114              76.321346  Canceled
945  788.0    873.0       1.0         2019-03-27   2019-03-10 17:19:48.264  5.0                Telemedicine  187.0                  126.695141              70.049167  Canceled
946  703.0    855.0       36.0        2019-03-12   2019-02-19 02:27:52.664  10.0               Follow Up     182.0                  140.703502              73.380812  Canceled
947  481.0    865.0       13.0        2019-05-03   2019-04-25 13:37:57.501  12.0               New           150.0                  84.841197               61.625454  Canceled
948  398.0    635.0       10.0        2019-02-09   2019-02-08 11:40:35.137  1.0                Physical      177.0                  125.037534              53.287036  Canceled
949  695.0    847.0       24.0        2019-03-04   2019-02-05 05:29:30.587  12.0               Follow Up     165.0                  116.879318              56.180317  Canceled